Recurrent Spatial Transformer Networks

Authors

  • Søren Kaae Sønderby
  • Casper Kaae Sønderby
  • Lars Maaløe
  • Ole Winther
Abstract

We integrate the recently proposed spatial transformer network (SPN) (Jaderberg & Simonyan, 2015) into a recurrent neural network (RNN) to form an RNN-SPN model. We use the RNN-SPN to classify digits in cluttered MNIST sequences. The proposed model achieves a single-digit error of 1.5%, compared to 2.9% for a convolutional network and 2.0% for a convolutional network with SPN layers. The SPN outputs a zoomed, rotated and skewed version of the input image. We investigate different down-sampling factors (the ratio of pixels in the input to pixels in the output) for the SPN and show that the RNN-SPN model is able to down-sample the input images without deteriorating performance. The down-sampling in RNN-SPN can be thought of as adaptive down-sampling that minimizes the information loss in the regions of interest. We attribute the superior performance of the RNN-SPN to the fact that it can attend to a sequence of regions of interest.
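
The zoom, rotation, skew and down-sampling described in the abstract can be illustrated with a small affine sampling sketch. This is not the authors' implementation: it is a minimal pure-Python illustration that assumes a 2x3 affine matrix, normalized coordinates in [-1, 1], bilinear interpolation and zero padding outside the image. The names `affine_grid`, `bilinear_sample`, `spatial_transform` and the `downsample` parameter are hypothetical and chosen only for this example.

```python
import math


def affine_grid(theta, out_h, out_w):
    """Map every output pixel to a source location via a 2x3 affine matrix.

    Coordinates are normalized to [-1, 1] (an assumption of this sketch).
    """
    grid = []
    for i in range(out_h):
        y_t = -1.0 + 2.0 * i / (out_h - 1) if out_h > 1 else 0.0
        row = []
        for j in range(out_w):
            x_t = -1.0 + 2.0 * j / (out_w - 1) if out_w > 1 else 0.0
            # Diagonal entries zoom, off-diagonal entries rotate or skew,
            # and the last column translates.
            x_s = theta[0][0] * x_t + theta[0][1] * y_t + theta[0][2]
            y_s = theta[1][0] * x_t + theta[1][1] * y_t + theta[1][2]
            row.append((x_s, y_s))
        grid.append(row)
    return grid


def bilinear_sample(image, grid):
    """Read the image at the (possibly fractional) grid locations."""
    in_h, in_w = len(image), len(image[0])
    out = []
    for row in grid:
        out_row = []
        for x_s, y_s in row:
            # Convert normalized coordinates back to pixel coordinates.
            x = (x_s + 1.0) * (in_w - 1) / 2.0
            y = (y_s + 1.0) * (in_h - 1) / 2.0
            x0, y0 = math.floor(x), math.floor(y)
            value = 0.0
            for yy in (y0, y0 + 1):
                for xx in (x0, x0 + 1):
                    # Pixels outside the image contribute zero.
                    if 0 <= yy < in_h and 0 <= xx < in_w:
                        weight = (1.0 - abs(x - xx)) * (1.0 - abs(y - yy))
                        value += weight * image[yy][xx]
            out_row.append(value)
        out.append(out_row)
    return out


def spatial_transform(image, theta, downsample=1):
    """Sample an output that is `downsample` times smaller per side."""
    out_h = len(image) // downsample
    out_w = len(image[0]) // downsample
    return bilinear_sample(image, affine_grid(theta, out_h, out_w))
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` and `downsample=2`, the output grid still spans the whole image, so detail is lost uniformly. Shrinking the diagonal entries (for example to 0.5) makes the same small grid cover only the central region, which is one way to picture the adaptive down-sampling the abstract refers to. In the model itself the matrix would be predicted by the network rather than fixed by hand.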

Similar resources

Recurrent 3D Attentional Networks for End-to-End Active Object Recognition in Cluttered Scenes

Active vision is inherently attention-driven: the agent selects views of observation to best approach the vision task while improving its internal representation of the scene being observed. Inspired by the recent success of attention-based models in 2D vision tasks based on single RGB images, we propose to address the multi-view depth-based active object recognition using attention mechanism, t...

Deep Tracking on the Move: Learning to Track the World from a Moving Vehicle using Recurrent Neural Networks

This paper presents an end-to-end approach for tracking static and dynamic objects for an autonomous vehicle driving through crowded urban environments. Unlike traditional approaches to tracking, this method is learned end-to-end, and is able to directly predict a full unoccluded occupancy grid map from raw laser input data. Inspired by the recently presented DeepTracking approach ([1], [2]), w...

Multi-objective Based Optimization Using Tap Setting Transformer, DG and Capacitor Placement in Distribution Networks

In this article, a multi-objective function for placement of Distributed Generation (DG) and capacitors with the tap setting of Under Load Tap Changer (ULTC) Transformer is introduced. Most of the recent articles have paid less attention to DG, capacitor placement and ULTC effects in the distribution network simultaneously. In simulations, a comparison between different modes was carried out with,...

Traffic Sign Classification Using Deep Inception Based Convolutional Networks

In this work, we propose a novel deep network for traffic sign classification that achieves outstanding performance on GTSRB, surpassing all previous methods. Our deep network consists of spatial transformer layers and a modified version of the inception module specifically designed for capturing local and global features together. This feature adoption allows our network to classify precisely int...

Detection of Transformer Winding Faults Using Wavelet Analysis and Neural Network

This paper investigates the application of wavelet transform as a preprocessor for neural networks (NN) in identifying internal turn-to-turn faults in transformer windings. The faulty and normal signals generated by numerical simulation of ElectroMagnetic Transient Program (EMTP) are preprocessed using discrete wavelet transform (DWT). The mean values of the wavelet coefficients are input into ...

Journal title:
  • CoRR

Volume abs/1509.05329  Issue 

Pages  -

Publication date 2015